Generalization in a Model of Infant Sensitivity to Syntactic Variation
Abstract
Computer simulations show that an unstructured neural-network model (Shultz & Bale, 2001) covers the essential features of infant differentiation of simple grammars in an artificial language, and generalizes by both extrapolation and interpolation. Other simulations (Vilcu & Hadley, 2003) claiming to show that this model did not really learn these grammars were flawed by confounding syntactic patterns with other factors and by a lack of statistical-significance testing. Thus, this model remains a viable account of infants' ability to learn and discriminate simple syntactic structures.

One of the enduring debates in cognitive science concerns the proper theoretical account of human cognition. Should cognition be interpreted in terms of symbolic rules or subsymbolic neural networks? It has been argued that infants' ability to distinguish one syntactic pattern from another could only be explained by a symbolic rule-based account (Marcus, Vijayan, Rao, & Vishton, 1999). After being familiarized with sentences in an artificial language having a particular syntactic form (such as ABA), infants preferred to listen to sentences with an inconsistent syntactic form (such as ABB). The claim about the necessity of rule-based processing was promptly contradicted by a number of neural-network modelers, several of whom produced unstructured models that captured the basic finding of more interest in novel than in familiar syntactic patterns (Altmann & Dienes, 1999; Elman, 1999; Negishi, 1999; Shultz, 1999; Shultz & Bale, 2001; Sirois, Buckingham, & Shultz, 2000). However, Vilcu and Hadley (2001, 2003) reported that two of these simulations (Altmann & Dienes, 1999; Elman, 1999) could not be replicated. Vilcu and Hadley (2003) were able to replicate the results of one simulation (Shultz & Bale, 2001), but they claimed that their extensions of this model failed to generalize, both in terms of interpolation within the training range and extrapolation outside of that range.
They concluded that this model did not really learn the grammars. The present paper contains new simulations establishing that this model (Shultz & Bale, 2001) does indeed learn the simple grammars used in the infant experiments, interpolating and extrapolating successfully.

The Original Simulations

Shultz and Bale (2001) used an encoder version of the cascade-correlation (CC) learning algorithm to simulate the infant data. CC is a constructive algorithm for learning from examples in feed-forward neural networks (Fahlman & Lebiere, 1990). Being constructive, CC builds its own network topology as it learns, recruiting new hidden units as needed. New hidden units are recruited one at a time, each installed on a separate layer. The candidate hidden unit that is recruited is the one whose activations correlate most highly with the current error of the network. CC has been used to simulate many aspects of psychological development (Shultz, 2003). For such developmental simulations, constructive learning algorithms offer a number of advantages over static networks, which only adjust connection weights but do not grow during learning (Shultz, 2005a; Shultz, Mysore, & Quartz, 2005).

Like other encoder networks, the encoder version of CC learns to reproduce its inputs on its output units. Discrepancy between inputs and outputs is treated as error, which CC attempts to reduce. Infants are thought to construct an internal model of the stimuli to which they are being exposed, and then to attend differentially to novel stimuli that deviate from their models. Encoder networks are capable of simulating this attention preference, with network error often used as an index of stimulus novelty in these simulations.

The three-word sentences used in the infant experiments were coded by Shultz and Bale (2001) with a continuous sonority scale, shown in Table 1, based on previous phonological research (Vroomen, van den Bosch, & de Gelder, 1998).
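The candidate-recruitment criterion described above can be illustrated with a minimal sketch. This is not the full CC algorithm (which trains a pool of candidate weights by gradient ascent on a covariance measure summed over output units before recruiting); it only shows the core selection step, with hypothetical function names, assuming one activation value per candidate per training pattern:

```python
from math import sqrt

def correlation(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def pick_candidate(candidate_acts, errors):
    """Index of the candidate hidden unit whose activations correlate
    most strongly (in absolute value) with the network's residual error.

    candidate_acts: one list of per-pattern activations per candidate.
    errors: the network's residual error on each training pattern.
    """
    scores = [abs(correlation(acts, errors)) for acts in candidate_acts]
    return scores.index(max(scores))
```

A candidate that tracks the error pattern-for-pattern scores near 1 and wins recruitment; a candidate whose activations are unrelated to the error scores near 0.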
Sonority is the quality of vowel-likeness; it has both acoustic and articulatory aspects. As Table 1 shows, sonorities ranged from -6 to 6 in steps of 1, with a gap and a change of sign between consonants and vowels. Each word in the three-word sentences used in the infant experiments was coded on two input units: one for the sonority of the consonant and one for the sonority of the vowel. For example, the sentence ga ti ga was coded on the network inputs as (-5 6 -6 4 -5 6). The consonant /g/ was coded as -5 and the vowel /a/ as 6, yielding (-5 6) for the word ga, which was the first and last word in this sentence. The consonant /t/ was coded as -6 and the vowel /i/ as 4, yielding (-6 4) for the word ti. Likewise, the sentence ni ni la was coded on the inputs as (-2 4 -2 4 -1 6).

The original simulation captured the essential features of the infant data, including exponential decreases in attention to a repeated syntactic pattern, more interest in sentences inconsistent with the familiar pattern than in sentences consistent with that pattern, occasional familiarity preferences, more recovery to consistent novel sentences than to familiar sentences, and generalization both outside and inside of the range of the training patterns (Shultz & Bale, 2001).
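The sonority coding described above can be sketched as a small lookup-and-encode step. The sonority values below are only those given for the worked examples in the text (the full scale in Table 1 is not reproduced here), and the function name is illustrative rather than taken from the original model:

```python
# Sonority values for the consonants and vowels that appear in the
# example sentences; consonants are negative, vowels positive.
SONORITY = {
    "g": -5, "t": -6, "n": -2, "l": -1,  # consonants
    "a": 6, "i": 4,                      # vowels
}

def encode_sentence(sentence):
    """Code a three-word sentence of CV syllables as six network inputs:
    consonant sonority then vowel sonority for each word in order."""
    codes = []
    for word in sentence.split():
        consonant, vowel = word[0], word[1]
        codes.append(SONORITY[consonant])
        codes.append(SONORITY[vowel])
    return codes

# encode_sentence("ga ti ga") -> [-5, 6, -6, 4, -5, 6]
# encode_sentence("ni ni la") -> [-2, 4, -2, 4, -1, 6]
```

Note that repeated syllables (as in an ABA or ABB pattern) yield identical input pairs, which is what allows an encoder network to pick up the syntactic pattern from the sonority codes alone.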
Publication date: 2005